The equality problem for infinite words generated 
by primitive morphisms 

Juha Honkala
Department of Mathematics 
University of Turku 
20014 Turku, Finland 
email: juha.honkala@utu.fi 

Abstract 

We study the equality problem for infinite words obtained by iterating 
morphisms. In particular, we give a practical algorithm to decide whether 
or not two words generated by primitive morphisms are equal. 

1 Introduction 

The equality problem for pure morphic words (also called the D0L
$\omega$-equivalence problem) was solved by Culik II and Harju [2]. A simpler
approach was suggested in [6]. The algorithms given in [2], [6] have very high
complexities. No practical algorithm is known in the general case. For the
equality problem of infinite words generated by polynomially bounded morphisms
see [3].

The purpose of this paper is to give a practical algorithm for the equality
problem of pure morphic words generated by primitive morphisms. By definition,
a morphism $g : X^* \to X^*$ is primitive if there is a positive integer $k$
such that for all $a, b \in X$, $a$ occurs in $g^k(b)$.

To explain our result let $X$ be a finite alphabet and let $g : X^* \to X^*$
be a morphism. Let $x \in X$ be a letter such that $x$ is a prefix of $g(x)$.
Then $g^i(x)$ is a prefix of $g^{i+1}(x)$ for all $i \ge 0$. If the length of
$g^{i+1}(x)$ is always greater than the length of $g^i(x)$, we let
$g^\omega(x)$ be the infinite word which has prefix $g^i(x)$ for all $i \ge 0$.
Infinite words generated in this way by iterating a morphism are called pure
morphic words (or infinite D0L words).
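As a small illustration of this construction, the following Python sketch iterates a morphism until a prefix of the requested length is available. The dictionary encoding of morphisms, the function names and the two example morphisms (the Thue-Morse and Fibonacci morphisms) are assumptions made for this example and are not taken from the paper.

```python
def apply_morphism(g, word):
    """Apply the morphism g (a dict mapping each letter to a word) to a word."""
    return "".join(g[c] for c in word)


def prefix_of_fixed_point(g, x, length):
    """Return the prefix of the given length of g^omega(x).

    Raises ValueError if x is not a prefix of g(x) or if the lengths of the
    iterates stop increasing, in which case g^omega(x) is not defined.
    """
    if not g[x].startswith(x):
        raise ValueError("x must be a prefix of g(x)")
    w = x
    while len(w) < length:
        nxt = apply_morphism(g, w)
        if len(nxt) <= len(w):
            raise ValueError("iterates do not grow; g^omega(x) is undefined")
        w = nxt
    return w[:length]


# Hypothetical example morphisms over the alphabet {a, b}.
thue_morse = {"a": "ab", "b": "ba"}
fibonacci = {"a": "ab", "b": "a"}

print(prefix_of_fixed_point(thue_morse, "a", 8))
print(prefix_of_fixed_point(fibonacci, "a", 8))
```

Because each iterate is a prefix of the next one, truncating any sufficiently long iterate gives the same prefix of the infinite word.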

Let now $X$ be an alphabet having $n \ge 2$ letters. Define
$A(n) = \lfloor 9n^3 \sqrt{n \log n} \rfloor$. Let $g : X^* \to X^*$ and
$h : X^* \to X^*$ be primitive morphisms and let $x \in X$ be a letter such
that $g^\omega(x)$ and $h^\omega(x)$ exist. Define $f_1 = g^{2n-2} h^{2n-2}$
and $f_2 = h^{2n-2} g^{2n-2}$. We will show that $g^\omega(x) = h^\omega(x)$ if
and only if $f_1$ and $f_2$ satisfy a balance condition and one of the words
$f_1 f_1^{A(n)}(x)$ and $f_2 f_1^{A(n)}(x)$ is a prefix of the other.

The paper is structured as follows. In Section 2 we recall some basic
definitions and earlier results. In particular, we define looping and loop-free
morphisms. Intuitively, a morphism $g : X^* \to X^*$ is looping if some power
of $g$ generates a periodic word. In Section 3 we recall the reduction of the
equality problem to the comparability problem. In Section 4 we study balance
properties of morphisms. In Section 5 we solve the equality problem for
loop-free primitive morphisms by using ideas and results from [5]. In Section 6
we solve the equality problem for looping primitive morphisms. The main result
is presented in Section 7.

We assume that the reader is familiar with the basics concerning free monoid 
morphisms and their iterations, see [TJ O HP] . For morphic words and their 
applications see also [5].

2 Preliminaries 

2.1 Basic definitions 

We use standard language-theoretic notation and terminology. In particular,
the length of a word $w$ is denoted by $|w|$. If $w \in X^*$ is a nonempty
word, then $\mathrm{alph}(w)$ is the set of all letters of $X$ occurring in
$w$. If $w$ is a nonempty word, then $w^\omega$ is the infinite word

$$w^\omega = www\cdots.$$

If $u, v \in X^*$ are words, $v$ is a factor of $u$ if there exist words
$u_1, u_2 \in X^*$ such that $u = u_1 v u_2$. If, furthermore,
$u_1 = \varepsilon$ then $v$ is a prefix of $u$. If $q$ is a nonnegative
integer and $w$ is a finite word or an infinite word, then
$\mathrm{Pref}_q(w)$ is the prefix of length $q$ of $w$ and $F_q(w)$ is the
set of factors of length $q$ of $w$. If $|w| < q$, it is understood that
$\mathrm{Pref}_q(w) = w$. If $L \subseteq X^*$, then

$$\mathrm{Pref}_q(L) = \{\mathrm{Pref}_q(w) \mid w \in L\}$$

and

$$F_q(L) = \bigcup_{w \in L} F_q(w).$$

Let $u \in X^*$ be a nonempty word and denote $u = a_1 \cdots a_k$ where
$a_i \in X$ for $i = 1, \ldots, k$. Then a positive integer $p$ is called a
period of $u$ if

$$a_{p+i} = a_i \quad \text{for } i = 1, \ldots, k - p. \tag{1}$$

The smallest positive integer $p$ satisfying (1) is called the period of $u$
and is denoted by $\mathrm{PER}(u)$.
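The definition of the period can be checked directly. The sketch below is a straightforward quadratic-time transcription of condition (1), using 0-based string indices; the function name `period` is an assumption made for this example.

```python
def period(u):
    """Return PER(u), the smallest positive p with u[i + p] == u[i]
    for every index i such that i + p is still inside u."""
    if not u:
        raise ValueError("the period is defined for nonempty words only")
    k = len(u)
    for p in range(1, k + 1):
        # p = k always satisfies the condition, since no index is compared.
        if all(u[i + p] == u[i] for i in range(k - p)):
            return p


print(period("abab"))
print(period("abaab"))
```

Every nonempty word of length k has k as a period, so the loop always returns a value.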

Let $g : X^* \to X^*$ be a morphism and let $w \in X^*$ be a word. If $w$ is a
prefix of $g(w)$, then $g^n(w)$ is a prefix of $g^{n+1}(w)$ for all $n \ge 0$.
If, furthermore,

$$\lim_{n \to \infty} |g^n(w)| = \infty,$$

then we denote

$$g^\omega(w) = \lim_{n \to \infty} g^n(w).$$

In all other cases $g^\omega(w)$ is not defined. Hence, if $g^\omega(w)$
exists, it is the unique infinite word which has the prefix $g^n(w)$ for all
$n \ge 0$.

2.2 Primitive morphisms

Suppose $g : X^* \to X^*$ is a morphism. Then $g$ is called primitive if there
exists a positive integer $k$ such that for all $x, y \in X$, $y$ occurs in
$g^k(x)$. If the alphabet $X$ contains at least two letters, primitive
morphisms are growing morphisms. In general, a morphism $g : X^* \to X^*$ is
called growing if for every $x \in X$ we have

$$|g^i(x)| \to \infty \quad \text{when } i \to \infty.$$

If $g : X^* \to X^*$ is a morphism, define

$$M_g = \max\{|g(x)| \mid x \in X\}$$

and

$$\mathrm{CYCLIC}(g) = \{x \in X \mid g(x) \text{ contains at least one occurrence of } x\}.$$
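The notions of this subsection are easy to compute for small examples. The following Python sketch tracks the sets alph(g^k(a)) to test primitivity and evaluates M_g and CYCLIC(g). The dictionary encoding and all function names are assumptions made for this example. The search for k is cut off at (n-1)^2 + 1, which relies on Wielandt's classical bound for primitive matrices rather than on anything stated in the paper.

```python
def letters_after(g, letters):
    """Return the set of letters occurring in g(a) for some a in letters."""
    return {c for a in letters for c in g[a]}


def is_primitive(g):
    """Decide whether some power g^k maps every letter to a word that
    contains every letter of the alphabet."""
    alphabet = set(g)
    n = len(alphabet)
    # reach[a] is alph(g^k(a)); initially k = 0.
    reach = {a: {a} for a in alphabet}
    # Assumption: by Wielandt's bound it suffices to try k <= (n - 1)^2 + 1.
    for _ in range((n - 1) ** 2 + 1):
        reach = {a: letters_after(g, reach[a]) for a in alphabet}
        if all(reach[a] == alphabet for a in alphabet):
            return True
    return False


def max_image_length(g):
    """Return M_g, the maximal length of the image of a letter."""
    return max(len(image) for image in g.values())


def cyclic_letters(g):
    """Return CYCLIC(g), the letters x that occur in g(x)."""
    return {x for x in g if x in g[x]}


fibonacci = {"a": "ab", "b": "a"}  # hypothetical example morphism
print(is_primitive(fibonacci), max_image_length(fibonacci), cyclic_letters(fibonacci))
```

Working with sets of letters instead of the words themselves keeps the test cheap even when the iterates grow exponentially.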

The following lemma gives some basic properties of primitive morphisms. 

Lemma 1 Let $X$ be an alphabet having $n \ge 2$ letters and let
$g : X^* \to X^*$ be a primitive morphism. Then

(i) $|g^n(x)| \ge M_g$ for all $x \in X$.

(ii) If $\mathrm{CYCLIC}(g) \ne \emptyset$, then
$\mathrm{alph}(g^{2n-2}(x)) = X$ for all $x \in X$.

Proof. Let $x, y \in X$. Because $g$ is primitive, there is an integer $i$
such that $y$ occurs in $g^i(x)$ and $i \le n - 1$. Hence
$|g^n(x)| \ge |g(y)|$. This implies (i).

Assume then that $z \in \mathrm{CYCLIC}(g)$. Then
$\mathrm{alph}(g^{n-1}(z)) = X$. Furthermore, if $x \in X$, then $z$ occurs in
$g^{n-1}(x)$. These facts imply (ii). □

For the proof of the following lemma see [1].

Lemma 2 Let $X$ be an alphabet with $n \ge 2$ letters and let
$g : X^* \to X^*$ be a growing morphism. Let $w \in X^*$. Then

$$\mathrm{Pref}_2(\{g^i(w) \mid i \ge 0\}) = \mathrm{Pref}_2(\{g^i(w) \mid i = 0, 1, \ldots, 3n - 2\})$$

and

$$F_3(\{g^i(w) \mid i \ge 0\}) = F_3(\{g^i(w) \mid i = 0, 1, \ldots, 2n^2 + 2n - 3\}).$$
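The stabilization described in Lemma 2 can be observed experimentally. The sketch below computes prefix and factor sets over finitely many iterates and compares the sets obtained within the bounds of the lemma with those obtained from a few further iterates. The helper names and the choice of the Thue-Morse morphism are assumptions made for this example, and a finite comparison of this kind only illustrates the lemma.

```python
def apply_morphism(g, word):
    return "".join(g[c] for c in word)


def iterates(g, w, count):
    """Return the list [w, g(w), ..., g^(count-1)(w)]."""
    words = []
    for _ in range(count):
        words.append(w)
        w = apply_morphism(g, w)
    return words


def pref(q, w):
    """Pref_q(w); equals w itself when |w| < q."""
    return w[:q]


def factors(q, w):
    """F_q(w), the set of factors of length q of w."""
    return {w[i:i + q] for i in range(len(w) - q + 1)}


def pref_set(q, words):
    return {pref(q, w) for w in words}


def factor_set(q, words):
    return set().union(*(factors(q, w) for w in words))


g = {"a": "ab", "b": "ba"}  # hypothetical example: the Thue-Morse morphism
n = len(g)
within_pref_bound = iterates(g, "a", 3 * n - 1)            # i = 0, ..., 3n - 2
within_factor_bound = iterates(g, "a", 2 * n * n + 2 * n - 2)  # i = 0, ..., 2n^2 + 2n - 3
further = iterates(g, "a", 13)                              # i = 0, ..., 12
print(pref_set(2, within_pref_bound) == pref_set(2, further))
print(factor_set(3, within_factor_bound) == factor_set(3, further))
```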

2.3 Looping and loop-free morphisms 

Let $g : X^* \to X^*$ be a morphism. Then we say that $g$ is looping if there
exist a positive integer $k$, a letter $x \in X$ and a nonempty word $u$ such
that $(g^k)^\omega(x)$ exists and

$$(g^k)^\omega(x) = u^\omega.$$

If $g$ is not looping, we say that $g$ is loop-free.

Loop-free morphisms avoid small periods in the sense of the following lemma.

Lemma 3 Let $X$ be an alphabet having $n$ letters and let $g : X^* \to X^*$ be
a morphism. If $g$ is growing and loop-free, then

$$\mathrm{PER}(g^d(x)) > \frac{1}{2M_g^n} |g^d(x)|$$

whenever $x \in X$ and $d$ is a positive integer.

For the proof of the next lemma see [7]. 

Lemma 4 Let $X$ be an alphabet having $n$ letters and let $g : X^* \to X^*$ be
a primitive morphism. Assume that $x \in X$, $k$ is a positive integer and

$$(g^k)^\omega(x) = u^\omega,$$

where $u \in X^*$ is a primitive word. Then

$$|g^{2n}(z)| > 2|u|$$

for all $z \in X$.

3 Properties of infinite words 

In this section we recall some results concerning infinite words generated by 
morphisms. 

Let $g : X^* \to X^*$ be a morphism and let $x \in X$ be a letter such that
$g^\omega(x)$ exists. If $w$ is a nonempty prefix of $g^\omega(x)$, then $w$
is a proper prefix of $g(w)$.

Lemma 5 Let $g_i : X^* \to X^*$, $i = 1, 2$, be morphisms and let $x \in X$ be
a letter such that $g_i^\omega(x)$ exists for $i = 1, 2$. Suppose

$$g_i^\omega(x) = w b_i \cdots$$

for $i = 1, 2$, where $w \in X^*$, $b_i \in X$ and $b_1 \ne b_2$. If
$i_1, \ldots, i_k \in \{1, 2\}$, then $g_{i_k} \cdots g_{i_1}(x)$ either is a
prefix of $w$ or has prefix $w b_{i_k}$.

Proof. Consider the word $g_{i_k} \cdots g_{i_1}(x)$ and assume that the claim
holds for $g_{i_{k-1}} \cdots g_{i_1}(x)$. If $g_{i_{k-1}} \cdots g_{i_1}(x)$
is a prefix of $w$, the claim holds. Assume $w$ is a prefix of
$g_{i_{k-1}} \cdots g_{i_1}(x)$. Then $g_{i_k}(w)$ is a prefix of
$g_{i_k} \cdots g_{i_1}(x)$. The claim follows because $g_{i_k}(w)$ is a prefix
of $g_{i_k}^\omega(x)$ which is longer than $w$. □

Lemma 6 Let $g : X^* \to X^*$ and $h : X^* \to X^*$ be morphisms and let
$x \in X$ be a letter such that $g^\omega(x)$, $h^\omega(x)$,
$(gh)^\omega(x)$ and $(hg)^\omega(x)$ exist. Then

$$g^\omega(x) = h^\omega(x) \tag{2}$$

if and only if

$$(gh)^\omega(x) = (hg)^\omega(x). \tag{3}$$

Proof. Assume first that (2) holds. If $u \in \mathrm{Pref}(g^\omega(x))$,
then $gh(u) \in \mathrm{Pref}(g^\omega(x))$. Hence

$$(gh)^i(x) \in \mathrm{Pref}(g^\omega(x))$$

for all $i \ge 0$. Therefore

$$(gh)^\omega(x) = g^\omega(x).$$

Similarly

$$(hg)^\omega(x) = h^\omega(x).$$

Hence (3) holds.

Assume then that (2) does not hold. Let

$$g^\omega(x) = wa \cdots, \quad h^\omega(x) = wb \cdots,$$

where $w \in X^*$, $a, b \in X$ and $a \ne b$. Then for large values of $i$ by
Lemma 5, $wa$ is a prefix of $(gh)^i(x)$ and $wb$ is a prefix of $(hg)^i(x)$.
Hence (3) does not hold. □

Let $g : X^* \to X^*$ and $h : X^* \to X^*$ be morphisms. Then the set
$\mathrm{COMP}(g, h)$ is defined by

$$\mathrm{COMP}(g, h) = \{w \in X^* \mid \text{there exist } u_1, u_2 \in X^* \text{ such that } g(w)u_1 = h(w)u_2\}.$$
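Membership in COMP(g, h) amounts to checking that one of the words g(w) and h(w) is a prefix of the other, since words u_1 and u_2 with g(w)u_1 = h(w)u_2 exist exactly in that case. A minimal Python sketch, with hypothetical function names and example morphisms, is:

```python
def apply_morphism(g, word):
    return "".join(g[c] for c in word)


def comparable(u, v):
    """Return True if one of the words u and v is a prefix of the other."""
    return u.startswith(v) or v.startswith(u)


def in_comp(g, h, w):
    """Return True if w belongs to COMP(g, h)."""
    return comparable(apply_morphism(g, w), apply_morphism(h, w))


# Hypothetical examples: the Thue-Morse morphism, its square, and the
# Fibonacci morphism.
thue_morse = {"a": "ab", "b": "ba"}
thue_morse_squared = {"a": "abba", "b": "baab"}
fibonacci = {"a": "ab", "b": "a"}

print(in_comp(thue_morse, thue_morse_squared, "abba"))
print(in_comp(thue_morse, fibonacci, "ab"))
```

The set COMP(g, h) is closed under taking prefixes: if g(w) and h(w) are comparable, then so are the images of every prefix of w.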

Lemma 7 Let $g : X^* \to X^*$ and $h : X^* \to X^*$ be morphisms and let
$x \in X$ be a letter such that $g^\omega(x)$ and $h^\omega(x)$ exist. Then

$$g^\omega(x) = h^\omega(x) \tag{4}$$

if and only if

$$\mathrm{Pref}(g^\omega(x)) \subseteq \mathrm{COMP}(g, h). \tag{5}$$

Proof. Assume first that (4) holds. Assume that $w$ is a prefix of
$g^\omega(x) = h^\omega(x)$. Then $g(w)$ is a prefix of $g^\omega(x)$ and
$h(w)$ is a prefix of $h^\omega(x)$. Hence $w \in \mathrm{COMP}(g, h)$.

Assume then that (4) does not hold. Let

$$g^\omega(x) = wa \cdots, \quad h^\omega(x) = wb \cdots,$$

where $w \in X^*$, $a, b \in X$ and $a \ne b$. Then $g(w)$ has prefix $wa$ and
$h(w)$ has prefix $wb$. Because $a \ne b$, the word $w$ is not in
$\mathrm{COMP}(g, h)$ which shows that (5) does not hold. □

4 Balance properties of morphisms 

Let $g : X^* \to X^*$ and $h : X^* \to X^*$ be morphisms. Define

$$\mathrm{BAL}(g, h) = \max\{\, \big| |g g^k(x)| - |h g^k(x)| \big| \;\mid\; x \in X,\ k \ge 0 \,\},$$

where the right-hand side is a nonnegative integer or $\infty$.
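The quantity BAL(g, h) is a maximum over infinitely many words, so it cannot be found by plain enumeration. The sketch below only evaluates the maximum over the exponents k = 0, ..., max_k, which gives a lower bound for BAL(g, h). It works with letter counts instead of words, because the length of an image depends only on how often each letter occurs. The function names, the cutoff parameter `max_k`, the convention that k starts at 0, and the example morphisms are assumptions made for this illustration.

```python
from collections import Counter


def image_counts(g, counts):
    """Letter counts of g(w), given the letter counts of w."""
    out = Counter()
    for letter, multiplicity in counts.items():
        for c in g[letter]:
            out[c] += multiplicity
    return out


def length_under(g, counts):
    """Length of g(w), given the letter counts of w."""
    return sum(m * len(g[letter]) for letter, m in counts.items())


def balance_up_to(g, h, max_k):
    """Maximum of | |g g^k(x)| - |h g^k(x)| | over all letters x and
    0 <= k <= max_k; a lower bound for BAL(g, h)."""
    best = 0
    for x in g:
        counts = Counter({x: 1})  # letter counts of g^0(x) = x
        for _ in range(max_k + 1):
            diff = abs(length_under(g, counts) - length_under(h, counts))
            best = max(best, diff)
            counts = image_counts(g, counts)
    return best


thue_morse = {"a": "ab", "b": "ba"}
fibonacci = {"a": "ab", "b": "a"}
print(balance_up_to(fibonacci, thue_morse, 5))
```

For the Fibonacci and Thue-Morse morphisms the partial maxima keep growing with max_k, which is consistent with an infinite balance in that example; a finite computation of this kind cannot by itself establish BAL(g, h) < ∞.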
For the proof of the following lemma see [3]. 

Lemma 8 Let $X$ be an alphabet having $n \ge 2$ letters and let
$g : X^* \to X^*$ and $h : X^* \to X^*$ be morphisms. Define
$M = \max\{M_g, M_h\}$. If

$$\mathrm{BAL}(g, h) < \infty,$$

then

$$\mathrm{BAL}(g, h) \le M^{2n-1} \exp(n^2(1 + \sqrt{6n \log n})).$$

Next we recall a very important result due to Culik II and Harju [2]. Let
$g : X^* \to X^*$ be a morphism and let $a \in X$ be a letter. Define
$X_1 = X - \{a\}$. Then the triple $(X, g, a)$ is called a 1-system if

(i) $g(a) \in aX_1^*$,

(ii) $g(x) \in X_1^*$ whenever $x \in X_1$,

(iii) $\{g^i(a) \mid i \ge 0\}$ is infinite,

(iv) if $x \in X_1$, then $x$ occurs infinitely many times in $g^\omega(a)$.

A 1-system $(X, g, a)$ is called 1-simple if the restriction of $g$ to $X_1$
is primitive. For the proof of the following result see [5].

Theorem 9 Let $(X, g_i, a)$, $i = 1, 2$, be 1-systems such that
$g_1^\omega(a) = g_2^\omega(a)$. If $(X, g_1 g_2, a)$ and $(X, g_2 g_1, a)$ are
1-simple then there exists a positive integer $K$ such that

$$\big| |g_1 g_2(w)| - |g_2 g_1(w)| \big| \le K$$

whenever $w$ is a prefix of $g_1^\omega(a)$.

In what follows we need the following consequence of Theorem 9. 

Lemma 10 Let $X_1$ be a finite alphabet. Let $g_1 : X_1^* \to X_1^*$ and
$g_2 : X_1^* \to X_1^*$ be primitive morphisms and let $x \in X_1$ be a letter
such that $g_1^\omega(x)$ and $g_2^\omega(x)$ exist and are equal. If
$g_1 g_2$ and $g_2 g_1$ are primitive then there exists a positive integer $K$
such that

$$\big| |g_1 g_2(w)| - |g_2 g_1(w)| \big| \le K$$

whenever $w$ is a factor of $g_1^\omega(x)$.

Proof. It is enough to prove the claim for the prefixes of $g_1^\omega(x)$.
This claim is a consequence of Theorem 9. Indeed, choose a new letter
$a \notin X_1$ and define $X = X_1 \cup \{a\}$. Assume that $g_i(x) = x u_i$
($i = 1, 2$) and extend $g_i$ by $g_i(a) = a u_i$ ($i = 1, 2$). Then
$(X, g_i, a)$, $i = 1, 2$, are 1-systems and $g_1^\omega(a) = g_2^\omega(a)$.
Furthermore, $(X, g_1 g_2, a)$ and $(X, g_2 g_1, a)$ are 1-simple. Hence
Theorem 9 is applicable. □

5 The equality problem for loop-free primitive
morphisms

5.1 The comparability problem 

In this subsection we first recall a result from [5]. 

Let $X$ be an alphabet having $n \ge 2$ letters and let $g : X^* \to X^*$ and
$h : X^* \to X^*$ be growing morphisms. Define the mapping
$\beta : X^* \to \mathbb{Z}$ by

$$\beta(w) = |g(w)| - |h(w)|, \quad w \in X^*.$$

Assume that $t \ge 2$ is a fixed integer. Assume that $d$ is a positive integer
such that we have

$$\mathrm{PER}(g^d(z)) > \frac{1}{t} |g^d(z)| \quad \text{for all } z \in X. \tag{6}$$

Define

$$e = d + 2n - 1$$

and

$$B = \max_{z \in X} |\beta(g^e(z))|.$$

Further, assume that $m$ is a positive integer such that

$$|g^d(z)| > 2tm \quad \text{for all } z \in X \tag{7}$$

and 

Define

$$L_1 = \mathrm{COMP}(h g^e, g g^e).$$

For the proof of the next lemma see [5, Lemma 9].

Lemma 11 Let $w \in X^*$. Then

$$w \in L_1$$

if and only if

$$\mathrm{Pref}_2(w) \in L_1 \quad \text{and} \quad F_3(w) \subseteq F_3(L_1).$$

Now we use Lemmas 2 and 11 to show that for a word $w \in X^*$ we can decide
the inclusion

$$\{g^i(w) \mid i \ge 0\} \subseteq \mathrm{COMP}(g, h)$$

by checking whether or not $g^i(w) \in \mathrm{COMP}(g, h)$ for a certain
number of initial values of $i$.

Lemma 12 Let $w \in X^*$. Then

$$\{g^i(w) \mid i \ge 0\} \subseteq \mathrm{COMP}(g, h)$$

if and only if

$$\{g^i(w) \mid i = 0, 1, \ldots, e + 2n^2 + 2n - 3\} \subseteq \mathrm{COMP}(g, h).$$

Proof. To prove the nontrivial part assume that
$g^i(w) \in \mathrm{COMP}(g, h)$ for $i = 0, 1, \ldots, e + 2n^2 + 2n - 3$.
Then $g^i(w) \in L_1$ for $i = 0, 1, \ldots, 2n^2 + 2n - 3$. Hence

$$\mathrm{Pref}_2(g^i(w)) \in L_1 \quad \text{and} \quad F_3(g^i(w)) \subseteq F_3(L_1) \tag{9}$$

for $i = 0, 1, \ldots, 2n^2 + 2n - 3$. Now Lemma 2 implies that (9) holds for
all $i \ge 0$. Therefore Lemma 11 implies that $g^i(w) \in L_1$ for all
$i \ge 0$. In other words $g^{e+i}(w) \in \mathrm{COMP}(g, h)$ for all
$i \ge 0$. Hence $g^i(w) \in \mathrm{COMP}(g, h)$ for all $i \ge 0$. □

5.2 The equality problem 

In this subsection we solve the equality problem for pure morphic words
generated by loop-free primitive morphisms under certain additional
assumptions.

Lemma 13 Let $X$ be an alphabet having $n \ge 2$ letters. Define
$A(n) = \lfloor 9n^3 \sqrt{n \log n} \rfloor$. Let $g : X^* \to X^*$ and
$h : X^* \to X^*$ be primitive morphisms. Let $x \in X$ be a letter such that
$g^\omega(x)$ and $h^\omega(x)$ exist. Assume that $g$ is loop-free,
$\mathrm{BAL}(g, h) < \infty$ and $M_h \le M_g^2$. Then

$$g^\omega(x) = h^\omega(x) \tag{10}$$

if and only if

$$g^{A(n)}(x) \in \mathrm{COMP}(g, h). \tag{11}$$

Proof. First, if (10) holds, Lemma 7 implies (11).

Assume then that (11) holds. Define the integers $t$, $m$, $d$, $e$ and $B$ as
follows. First, define

$$t = 2M_g^n$$

and

$$m = \lceil (n^2 + 1) M_g^{2(2n-1)} \exp(n^2(1 + \sqrt{6n \log n})) \rceil.$$

Let $q$ be a real number such that

$$2tm = M_g^q.$$

Then define

$$d = (\lfloor q \rfloor + 1)n, \quad e = d + 2n - 1.$$

Finally, define

$$B = \max_{z \in X} |\beta(g^e(z))|.$$

Then

$$e + 2n^2 + 2n - 2 \le 9n^3 \sqrt{n \log n}. \tag{12}$$

Now we are in a position to apply Lemma 12. First, $g$ and $h$ are growing
because $g$ and $h$ are primitive and $\mathrm{card}(X) \ge 2$. Because $g$ is
loop-free, Lemma 3 implies (6). By Lemma 1 we have

$$|g^d(z)| \ge M_g^{\lfloor q \rfloor + 1} > 2tm$$

for all $z \in X$. Hence (7) holds. By Lemma 8 also (8) holds. Observe that the
assumption $M_h \le M_g^2$ implies that $M = \max\{M_g, M_h\} \le M_g^2$.
Consequently the assumptions needed for Lemma 12 hold. By (11) and (12), every
word $g^i(x)$ with $i \le e + 2n^2 + 2n - 3$ is a prefix of $g^{A(n)}(x)$ and
hence belongs to $\mathrm{COMP}(g, h)$, so Lemma 12 implies that
$g^i(x) \in \mathrm{COMP}(g, h)$ for all $i \ge 0$. Now Lemma 7 implies (10). □



6 The equality problem for looping primitive 
morphisms 

In this section we solve the equality problem for pure morphic words generated 
by looping primitive morphisms under some additional assumptions. 

Lemma 14 Let $X$ be an alphabet having $n \ge 2$ letters. Let
$g : X^* \to X^*$ and $h : X^* \to X^*$ be primitive morphisms. Let $x \in X$
be a letter such that $g^\omega(x)$ and $h^\omega(x)$ exist. Assume that $g$ is
looping and $\mathrm{BAL}(g, h) < \infty$. Then

$$g^\omega(x) = h^\omega(x) \tag{13}$$

if and only if

$$g^{2n}(x) \in \mathrm{COMP}(g, h). \tag{14}$$

Proof. Because $g$ is looping there exist a positive integer $k$, a letter
$z \in X$ and a nonempty primitive word $v$ such that

$$(g^k)^\omega(z) = v^\omega.$$

Because $g$ is primitive, each letter of $X$ occurs in $v$. Because
$g^k(v) \in v^*$ it follows that $g^{ki}(x)$ is a factor of $v^\omega$ for all
$i \ge 0$. This implies that there is a conjugate $u$ of $v$ such that for
infinitely many values of $i$, the word $g^{ki}(x)$ is a prefix of $u^\omega$.
Because $g^{ki}(x)$ is a prefix of $g^\omega(x)$ for all $i \ge 0$, we have

$$g^\omega(x) = u^\omega. \tag{15}$$

By Lemma 4 we have

$$|g^{2n}(x)| > 2|u|. \tag{16}$$

Because $\mathrm{BAL}(g, h) < \infty$, we have $|g(u)| = |h(u)|$.

Assume now that (14) holds. By (15) and (16), the word $u$ is a prefix of
$g^{2n}(x)$. Hence $u \in \mathrm{COMP}(g, h)$, which implies that
$g(u) = h(u)$. Therefore $g^i(x) \in \mathrm{COMP}(g, h)$ for all $i \ge 0$.
This implies (13) by Lemma 7.

Conversely, if (13) holds, Lemma 7 implies (14). □

7 The equality problem for primitive morphisms

In the previous sections we have studied the equality
$g^\omega(x) = h^\omega(x)$ with the assumption that
$\mathrm{BAL}(g, h) < \infty$. It remains to give a necessary and sufficient
condition for the equality $g^\omega(x) = h^\omega(x)$ without the assumption
that $\mathrm{BAL}(g, h) < \infty$.

Theorem 15 Let $X$ be an alphabet having $n \ge 2$ letters. Define
$A(n) = \lfloor 9n^3 \sqrt{n \log n} \rfloor$. Let $g : X^* \to X^*$ and
$h : X^* \to X^*$ be primitive morphisms and let $x \in X$ be a letter such
that $g^\omega(x)$ and $h^\omega(x)$ exist. Define
$f_1 = g^{2n-2} h^{2n-2}$ and $f_2 = h^{2n-2} g^{2n-2}$. Then

$$g^\omega(x) = h^\omega(x) \tag{17}$$

if and only if

$$\mathrm{BAL}(f_1, f_2) < \infty \tag{18}$$

and

$$f_1^{A(n)}(x) \in \mathrm{COMP}(f_1, f_2). \tag{19}$$


Proof. By Lemma 1,

$$\mathrm{alph}(g^{2n-2}(z)) = \mathrm{alph}(h^{2n-2}(z)) = X \tag{20}$$

for all $z \in X$. Hence $f_1$ and $f_2$ are primitive morphisms. In
particular, $f_1$ and $f_2$ are growing morphisms and $f_1^\omega(x)$ and
$f_2^\omega(x)$ exist. If $y, z \in X$, then by (20) we have

$$|f_2(y)| = |h^{2n-2} g^{2n-2}(y)| \le |h^{2n-2} g^{2n-2} h^{2n-2}(z)| \le |f_1^2(z)|.$$

Hence

$$M_{f_2} \le M_{f_1^2} \le M_{f_1}^2.$$

Now suppose that (17) holds. Then Lemma 6 implies that

$$f_1^\omega(x) = f_2^\omega(x). \tag{21}$$

Now Lemma 10 implies (18) and Lemma 7 implies (19).

Assume then that (18) and (19) hold. If $f_1$ is loop-free (resp. looping),
then Lemma 13 (resp. Lemma 14) implies that (21) holds. Then we have (17) by
Lemma 6. □
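The objects appearing in Theorem 15 can be assembled mechanically. The following Python sketch builds $f_1$ and $f_2$, evaluates $A(n)$, and tests whether $f_1^i(x)$ belongs to COMP($f_1$, $f_2$) for a caller-chosen number $i$ of iterations. It is an illustration under several assumptions: the function names and the dictionary encoding are hypothetical, "log" is read as the natural logarithm, and the balance condition (18) is not checked at all. Since the length of $f_1^i(x)$ grows exponentially with $i$, the example uses very small values of $i$ and does not attempt $i = A(n)$.

```python
import math


def apply_morphism(g, word):
    return "".join(g[c] for c in word)


def compose(g, h):
    """Return the morphism w -> g(h(w)), matching the product notation gh."""
    return {c: apply_morphism(g, h[c]) for c in h}


def power(g, k):
    """Return the k-th power of the morphism g."""
    result = {c: c for c in g}
    for _ in range(k):
        result = compose(g, result)
    return result


def comparable(u, v):
    return u.startswith(v) or v.startswith(u)


def bound_a(n):
    # Assumption: "log" denotes the natural logarithm.
    return math.floor(9 * n**3 * math.sqrt(n * math.log(n)))


def prefix_test(g, h, x, iterations):
    """Return True if f1^iterations(x) lies in COMP(f1, f2), where
    f1 = g^(2n-2) h^(2n-2) and f2 = h^(2n-2) g^(2n-2)."""
    n = len(g)
    g_pow, h_pow = power(g, 2 * n - 2), power(h, 2 * n - 2)
    f1, f2 = compose(g_pow, h_pow), compose(h_pow, g_pow)
    w = x
    for _ in range(iterations):
        w = apply_morphism(f1, w)
    return comparable(apply_morphism(f1, w), apply_morphism(f2, w))


thue_morse = {"a": "ab", "b": "ba"}
thue_morse_squared = {"a": "abba", "b": "baab"}
fibonacci = {"a": "ab", "b": "a"}

print(bound_a(2))
print(prefix_test(thue_morse, thue_morse_squared, "a", 1))
print(prefix_test(fibonacci, thue_morse, "a", 1))
```

By the first half of the proof above, a negative answer for any number of iterations already shows that the two infinite words differ. A positive answer for a small number of iterations proves nothing by itself; the theorem requires the test at $A(n)$ iterations together with condition (18).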

For a method to decide the condition (18) see [H p. 512], where it is
explained how one can compute polynomials $P_1(z), \ldots, P_n(z)$ depending on
$f_1$ and $f_2$ such that (18) holds if and only if for each $j$ there is a
positive integer $p \le \exp(\sqrt{6n \log n})$ such that $P_j(z)$ divides
$1 - z^p$. Hence the degrees of the polynomials depend only on the cardinality
of the alphabet.